Cooking The Code - Felix John COLIBRI.
- abstract : a source code filtering utility which removes unwanted rows, areas and comments. Complete with Unit Test.
- key words : code transformation - code filtering - mark up - tStringList handling - unit test
- software used : Windows XP Home, Delphi 6
- hardware used : Pentium 2.800Mhz, 512 M memory, 140 G hard disc
- scope : Delphi 1 to 2006, Turbo Delphi for Windows, Kylix
  Delphi 5, 6, 7, 8, Delphi 2005, 2006, Turbo Delphi, Turbo 2007, Rad Studio 2007, Rad Studio 2009
- level : Delphi developer
1 - Why Source Code Filtering ?
For most of our customers, we provide the source code of our contracted projects. For many of these applications, we work on a developer version that includes parts which are of no interest to the customer:
- trial code
- developer comments (todo list, add this, contact such and such for review, who corrected this little bit, see also some other project)
- debug logging, which can be more voluminous than usage logging
Before delivering the source, we could manually remove the unwanted parts, but, as with other "dual code" endeavours, this process is error prone and tedious:
- we could miss some parts
- we could remove too many lines
- maintaining two versions of the code over time can become a nightmare
Therefore we use a small utility which removes the unwanted parts. The process relies on simple, innocuous comment markers and a companion filtering parser.
2 - The Delphi Code Cooker
2.1 - The Objective and the Grammar
The easiest way is to present a "before / after" example. Here is a small piece of code:
Let's assume that we want to remove
- isolated rows or groups of rows
- comments
To do so, we add markers, with the following effect (on the left the original, on the right the filtered text):
- to remove a row, we append "//-" at the end of the row
unit _cooker_test; | unit _cooker_test;
interface | interface
implementation | implementation
BBB | BBB
aaa //- |
CCC | CCC
- to remove row groups, we use "//>" and "//<"
  - either surrounding those lines
CCC | CCC
//> |
bbb |
ccc |
//< |
DDD | DDD
  - or at the end of the first and the last line
DDD | DDD
ddd //> |
eee |
fff //< |
EEE | EEE
- for the other types of comments
  - first the "//" comments:
    - the special "// --- " is used to keep a comment
EEE | EEE
// --- FFF | // --- FFF
GGG | GGG
    - all other start of line "//" comments will be removed
GGG | GGG
// |
// ggg |
// -- hhh |
HHH | HHH
    - the end of line "//" comments are removed, except those after BEGIN and END;
HHH | HHH
III // -- jjj | III
BEGIN // JJJ | BEGIN // JJJ
END; // KKK | END; // KKK
LLL | LLL
  - for the "(*" comments
    - the comments starting with "(*" on a line by itself will be removed
JJJ | JJJ
(* |
jjj |
kkk *) |
(* |
ppp *) QQQ | QQQ
    - other "(*" comments will remain
(* KKK | (* KKK
LLL *) | LLL*)
(*$R+*) | (*$R+*)
MMM (* NNN | MMM (* NNN
OOO *) PPP | OOO *) PPP
  - the rules are the same for "{"
    - the comments starting with "{" on a line by itself will be removed
JJJ | JJJ
{ |
jjj |
kkk } |
{ |
ppp } QQQ | QQQ
    - other "{" comments will remain
{ KKK | { KKK
LLL } | LLL}
{$R+} | {$R+}
MMM { NNN | MMM { NNN
OOO } PPP | OOO } PPP
Please note that
- the rules are somewhat dependent on our coding habits. For instance we use
  - "//" comments for explanations, with two dashes, "// -- ", to make them stand out from the code
  - end of line "//" comments to exclude previous values
  - "(*" comments for contiguous block elimination (from the compilation). Those blocks can contain "//" comments. In addition, when we exclude some block from compilation, we place the "(*" and "*)" at the margin, on a separate line:

      x:= 5;
    (*
      // -- this is an explanation comment
      y:= 8;
    *)
      a:= b;

  - "{" for non contiguous block elimination. So "{" comments can enclose several "(*" blocks

      a:= b;
    {
      c:= d;
    (*
      e:= f;
    *)
      g:= h;
    (*
      i:= j;
    *)
    }
      k:= l;

- this explains why we chose
  - the "//>" "//<" "//-" "// --- " as code cooking markers, since we never use those in our usual code
  - the "(*" and "{" on single lines to remove blocks. If we decide to keep some commented out block in the cooked code, we simply add any character after the "(*" or "{" (like "(*+" for instance)
- for commented out code, we use "(*" on a single line, and also the matching "*)" on a line by itself. But to stay consistent with the compiler, we accept that the matching "*)" be placed anywhere in a line

      x:= 5;
    (*
      y:= 8;
    *)
      a:= b;
    (*
      d:= 9; *) e:= 18
      f:= 15;
2.2 - The Delphi Code
Basically we use a tStrings to analyze the text, looking for the different markers using
- Copy for start of line extraction
- Pos for the other lookups
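To make the two lookup styles concrete, here is a minimal sketch, assuming that leading blanks are skipped before the start of line test. The three function names are hypothetical (they are not taken from the downloadable source), and only Copy, Pos and Length from the System unit are used.

```pascal
Program marker_lookup_sketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFNDEF FPC}{$APPTYPE CONSOLE}{$ENDIF}

// -- hypothetical helper: index of the first non blank character, 0 if none
Function f_first_non_blank(p_line: String): Integer;
  Begin
    Result:= 1;
    While (Result<= Length(p_line)) And (p_line[Result]= ' ') Do
      Inc(Result);
    If Result> Length(p_line)
      Then Result:= 0;
  End; // f_first_non_blank

// -- start of line test: Copy extracts the first characters after the blanks
Function f_starts_with_marker(p_line, p_marker: String): Boolean;
  Var l_index: Integer;
  Begin
    l_index:= f_first_non_blank(p_line);
    Result:= (l_index> 0)
        And (Copy(p_line, l_index, Length(p_marker))= p_marker);
  End; // f_starts_with_marker

// -- anywhere in the line test: Pos returns 0 when the marker is absent
Function f_contains_marker(p_line, p_marker: String): Boolean;
  Begin
    Result:= Pos(p_marker, p_line)> 0;
  End; // f_contains_marker

Begin
  Writeln(f_starts_with_marker('   //> start of a removed group', '//>'));
  Writeln(f_contains_marker('aaa //-', '//-'));
  Writeln(f_starts_with_marker('aaa //-', '//-'));
End.
```

The first two calls report a match and the third does not, since "//-" is at the end of the row and not at its start.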
2.2.1 - The Class definition
The worker Class is very simple

Type c_remove_marked_up_text= Class(c_basic_object)
       m_c_original_list: tStringList;
       m_c_result_list: tStringList;

       Constructor create_remove_marked_up_text(p_name: String);
       Procedure remove_marked_up_text;
       Destructor Destroy; Override;
     End; // c_remove_marked_up_text
2.2.2 - The main filtering loop
The main loop is also very simple
- we read each line
- we call Functions which each test one of the marker rules. Each Function
  - decides how to transform the line (mainly a keep or throw away choice)
  - returns True if it has handled the line
Here is the main loop:
Procedure c_remove_marked_up_text.remove_marked_up_text;
  Var l_list_index: Integer;
      l_the_line, l_trimmed_line: String;

  Function f_end_of_file: Boolean;
    // -- ooo
  Procedure read_line;
    // -- ooo
  Procedure add_result_line;
    // -- ooo

  Function f_remove_start_of_line_slash_comments: Boolean;
    // -- ooo

  Begin // remove_marked_up_text
    m_c_result_list.Clear;
    l_list_index:= 0;
    read_line;

    While Not f_end_of_file Do
    Begin
      If f_remove_start_of_line_slash_comments Then Else
      If f_remove_parenthesis_star_comment Or f_remove_brace_comment Then Else
      If f_remove_ds_ending_slash_comments Then Else
      If f_remove_ds_endig_slash_minus_comments Then Else
      If f_remove_ds_middle_line_slash_comments Then Else
        Begin
          add_result_line;
          read_line;
        End;
    End; // while l_the_line

    If l_the_line<> ''
      Then add_result_line;
  End; // remove_marked_up_text
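Since the listing above elides the helper bodies, the "first rule which returns True wins" chain can be tried out with the following self-contained sketch. It is a simplified stand-in and not the article's code: it keeps only two rules, and filter_lines, f_remove_slash_minus_line, f_filter_text, the l_has_line flag and the bodies of read_line and add_result_line are all our own assumptions.

```pascal
Program filter_loop_sketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFNDEF FPC}{$APPTYPE CONSOLE}{$ENDIF}

Uses SysUtils, Classes;

// -- ASSUMPTION: simplified stand-in for remove_marked_up_text, with only
// --   two rules. The helper bodies are a guess, not the article's code
Procedure filter_lines(p_c_original_list, p_c_result_list: tStrings);
  Var l_list_index: Integer;
      l_the_line: String;
      l_has_line: Boolean;

  Procedure read_line;
    Begin
      l_has_line:= l_list_index< p_c_original_list.Count;
      If l_has_line
        Then Begin
            l_the_line:= p_c_original_list[l_list_index];
            Inc(l_list_index);
          End
        Else l_the_line:= '';
    End; // read_line

  Procedure add_result_line;
    Begin
      p_c_result_list.Add(l_the_line);
    End; // add_result_line

  Function f_remove_start_of_line_slash_comments: Boolean;
    Var l_trimmed_line: String;
    Begin
      l_trimmed_line:= TrimLeft(l_the_line);
      Result:= Copy(l_trimmed_line, 1, 2)= '//';
      If Result
        Then Begin
            // -- "// --- " comments are kept, all the others are dropped
            If Copy(l_trimmed_line, 3, 5)= ' --- '
              Then add_result_line;
            read_line;
          End;
    End; // f_remove_start_of_line_slash_comments

  Function f_remove_slash_minus_line: Boolean;
    Begin
      Result:= Pos('//-', l_the_line)> 0;
      If Result
        Then read_line;
    End; // f_remove_slash_minus_line

  Begin // filter_lines
    p_c_result_list.Clear;
    l_list_index:= 0;
    read_line;

    While l_has_line Do
      // -- the first rule which returns True has consumed the line
      If f_remove_start_of_line_slash_comments Then Else
      If f_remove_slash_minus_line Then Else
        Begin
          add_result_line;
          read_line;
        End;
  End; // filter_lines

// -- hypothetical convenience wrapper: filters a multi-line text
Function f_filter_text(p_text: String): String;
  Var l_c_original_list, l_c_result_list: tStringList;
  Begin
    l_c_original_list:= tStringList.Create;
    l_c_result_list:= tStringList.Create;
    Try
      l_c_original_list.Text:= p_text;
      filter_lines(l_c_original_list, l_c_result_list);
      Result:= l_c_result_list.Text;
    Finally
      l_c_result_list.Free;
      l_c_original_list.Free;
    End;
  End; // f_filter_text

Begin
  Write(f_filter_text('BBB'+ sLineBreak+ 'aaa //-'+ sLineBreak
      + '  // ggg'+ sLineBreak+ '// --- FFF'+ sLineBreak+ 'CCC'));
End.
```

Each rule function either consumes the current line (and reads the next one) or leaves it untouched for the following rule, so the empty "Then" branches only express "already handled". In this sketch the loop is driven by a flag set in read_line, which is one possible way to write the elided f_end_of_file test.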
2.2.3 - Example of start of line marker
We handle the "//" at the start of the line in the following way:

Function f_remove_start_of_line_slash_comments: Boolean;
    // -- True if has a "__//xxxxxx" line

  Procedure erase_slash_greater_bloc;
      // -- a "__//> ... __//< bloc
    Begin
      Repeat
        read_line;
      Until f_end_of_file Or (Copy(l_the_line, l_index, 3)= '//<');
      read_line;
    End; // erase_slash_greater_bloc

  Begin // f_remove_start_of_line_slash_comments
    If Copy(l_the_line, l_index, 2)= '//'
      Then Begin
          Result:= True;
          // -- keep if "// --- "
          If Copy(l_the_line, l_index+ 2, 5)= ' --- '
            Then Begin
                add_result_line;
                read_line;
              End
            Else
              If Copy(l_the_line, l_index, 3)= '//>'
                Then erase_slash_greater_bloc
                Else read_line;
        End
      Else Result:= False;
  End; // f_remove_start_of_line_slash_comments
2.2.4 - Example of middle of line marker
We remove the middle "//" with code like this:
Function f_remove_ds_middle_line_slash_comments: Boolean;
    // -- "xxx // yyy"
    // -- CAUTION: no // within a string
  Var l_slash_slash_position: Integer;
  Begin
    l_slash_slash_position:= Pos('//', l_the_line);
    If (l_slash_slash_position> 1)
        And Not (
            (Pos('end; // ', LowerCase(l_the_line))> 0)
          Or
            (Pos('begin // ', LowerCase(l_the_line))> 0)
          )
      Then Begin
          Result:= True;
          Delete(l_the_line, l_slash_slash_position,
              Length(l_the_line)+ 1- l_slash_slash_position);
          add_result_line;
          read_line;
        End
      Else Result:= False;
  End; // f_remove_ds_middle_line_slash_comments

Please note that
- this code works since we removed the start of line "//" and the ending "//-", "//>", "//<" before (see the main loop). So our code depends on the calling order of our filtering functions
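The "CAUTION: no // within a string" comment points at a real limitation: Pos finds the first "//" even when it sits inside a quoted literal such as a URL. If that ever becomes a problem, a quote aware lookup could replace the Pos call. The sketch below is only an illustration: f_slash_slash_position is a hypothetical name, and it does not handle a "//" located inside a "{ }" or "(* *)" comment.

```pascal
Program slash_outside_string_sketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFNDEF FPC}{$APPTYPE CONSOLE}{$ENDIF}

// -- hypothetical helper: position of the first "//" which is NOT inside a
// --   Pascal string literal, or 0 if there is none
Function f_slash_slash_position(p_line: String): Integer;
  Var l_index: Integer;
      l_in_string: Boolean;
  Begin
    Result:= 0;
    l_in_string:= False;
    l_index:= 1;

    While (Result= 0) And (l_index<= Length(p_line)) Do
    Begin
      // -- each quote toggles the state, so a doubled quote ('') inside a
      // --   literal toggles twice and leaves us inside the literal
      If p_line[l_index]= ''''
        Then l_in_string:= Not l_in_string
        Else
          If (Not l_in_string) And (Copy(p_line, l_index, 2)= '//')
            Then Result:= l_index;
      Inc(l_index);
    End; // while
  End; // f_slash_slash_position

Var g_line: String;

Begin
  g_line:= 'url:= ''http://x''; // link';
  Writeln('Pos             : ', Pos('//', g_line));
  Writeln('outside strings : ', f_slash_slash_position(g_line));
End.
```

On this sample line, Pos stops at the "//" of the URL, whereas the helper skips the literal and returns the position of the real end of line comment.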
2.2.5 - The main Form
The main form is quite standard. Here is a snapshot of this form, where
- the help is a memo saved in a .txt file (can be filled by the user)
- the favorites are loaded from a .txt file (same function as a filter combo box, but with a fixed format instead of a drop down)
- the source code to filter is defined by its path and file name
- the destination is defined by its path (with same file name), and the purple edit can be used to create the sub folder
- selecting a source .PAS file
  - loads the text in the "original_" memo (where it still can be modified and saved to disc)
  - computes the filtered result
  - presents this result in the "result_" memo (where we still can modify and save it)
3 - Unit Test to Test the Code Cooker
To test that our filtering routines correctly remove the marked up code, we wrote a unit test Class. The test Class definition is
Type c_remove_marker_test= Class(c_test_case)
Private
m_c_remove_sncf: c_remove_marked_up_text;
Protected
Procedure Setup; Override;
Procedure Teardown; Override;
Published
// -- all the tests
Procedure test_slash_comment;
Procedure test_slash_comment_enclosing;
Procedure test_slash_eol_comment_enclosing;
Procedure test_slash_keep_comment;
Procedure test_slash_remove_comment;
Procedure test_remove_eol_slash_comment;
Procedure test_remove_parenthesis_star_comment;
Procedure test_keep_parenthesis_star_comment;
Procedure test_remove_brace_comment;
Procedure test_keep_brace_comment;
End; // c_remove_marker_test
and, as an example, here is the test of the enclosing "//>" and "//<" markers:
Procedure c_remove_marker_test.test_slash_comment_enclosing;
  Begin
    With m_c_remove_sncf, m_c_original_list Do
    Begin
      Add(' CCC');
      Add(' //>');
      Add(' bbb');
      Add(' ccc');
      Add(' //<');
      Add(' DDD');

      remove_marked_up_text;

      Check(m_c_result_list.Strings[0]= m_c_original_list[0],
          '"enclosing //> //<", removed line 0') ;
      Check(m_c_result_list.Strings[1]= m_c_original_list[5],
          '"enclosing //> //<", removed line 5') ;
      Check(m_c_result_list.Count= Count- 4+ 1,
          '"enclosing //> //<", count') ;
    End; // with m_c_remove_sncf
  End; // test_slash_comment_enclosing

The result of the test looks like:
4 - Comments and Improvements
This tool suits our needs, because it neatly fits our coding conventions. However, it can easily be improved. You might
- change the markers (for instance, use "//%" instead of "//-")
- use more conspicuous markers. We decided to make them as discreet as possible, since we do not want to notice them while developing the code
- change the semantics (what you want to filter out or keep)
- add additional filtering rules (or remove some of our own)
It could also be useful to add some checking procedures (to check that the "//>" and "//<" counts match, etc).
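As an illustration of such a check, here is a minimal sketch of a marker pairing test. It is not part of the published tool: f_first_unbalanced_line and f_check_text are hypothetical names, and the sketch assumes that "//>" groups are never nested, which matches the way the start of line rule skips to the first "//<".

```pascal
Program marker_balance_sketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFNDEF FPC}{$APPTYPE CONSOLE}{$ENDIF}

Uses SysUtils, Classes;

// -- hypothetical checker, not part of the article's tool.
// --   Returns 0 when the "//>" and "//<" markers pair up, otherwise the
// --   1-based number of the first offending line.
// --   ASSUMPTION: groups are not nested
Function f_first_unbalanced_line(p_c_list: tStrings): Integer;
  Var l_line_index, l_open_line: Integer;
  Begin
    Result:= 0;
    // -- line number of the pending "//>", 0 when no group is open
    l_open_line:= 0;

    For l_line_index:= 0 To p_c_list.Count- 1 Do
      If Pos('//>', p_c_list[l_line_index])> 0
        Then Begin
            If l_open_line> 0
              Then Begin
                  // -- a second "//>" before the closing "//<"
                  Result:= l_line_index+ 1;
                  Exit;
                End;
            l_open_line:= l_line_index+ 1;
          End
        Else
          If Pos('//<', p_c_list[l_line_index])> 0
            Then Begin
                If l_open_line= 0
                  Then Begin
                      // -- a "//<" without a previous "//>"
                      Result:= l_line_index+ 1;
                      Exit;
                    End;
                l_open_line:= 0;
              End;

    // -- an unclosed group is reported at its opening line
    Result:= l_open_line;
  End; // f_first_unbalanced_line

// -- hypothetical convenience wrapper: checks a multi-line text
Function f_check_text(p_text: String): Integer;
  Var l_c_list: tStringList;
  Begin
    l_c_list:= tStringList.Create;
    Try
      l_c_list.Text:= p_text;
      Result:= f_first_unbalanced_line(l_c_list);
    Finally
      l_c_list.Free;
    End;
  End; // f_check_text

Begin
  Writeln(f_check_text('CCC'+ sLineBreak+ '//>'+ sLineBreak+ 'bbb'
      + sLineBreak+ '//<'+ sLineBreak+ 'DDD'));
  Writeln(f_check_text('aaa'+ sLineBreak+ 'ddd //>'+ sLineBreak+ 'eee'));
End.
```

Because the lookup uses Pos, the same check covers both forms of the group markers (on their own line, or at the end of the first and last line). Running it before the filter would flag a forgotten "//<" before it silently swallows the rest of a unit.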
The filter is similar to weaving (we consider two aspects of the same code), but our process is unidirectional (there is no easy way to integrate customer changes in the original source).
There are some obvious drawbacks:
- we are only removing areas. It would be quite tedious to remove variables or some procedure parameters
- there is no check as to the consistency of the filtering (we could remove a class method definition, and not remove its implementation, or remove a Uses name, only to find out after compilation that some imported information is still used in the filtered code)
- the technique is invasive (the original code is modified by inserting markers to perform the filtering)
In our working tool we also added the removal of procedure declarations, implementations and calls. We hand over a list of procedure names, and the filter removes them everywhere. The nice thing is that this technique is not invasive. It requires, however, a lexical analyzer, which the filter presented in this article manages to avoid.
Depending on the success of this article, on time, and on popular demand ( :) ) we could (well, cook up, and then) publish this procedure filtering tool, with its scanner, parser and unit test code, in a companion article on this site.
As a last note, we used the term "cooking the code" as a nod to the very funny "cooking the books" expression.
5 - Download the Source Code
Here are the source code files:
The .ZIP file(s) contain:
- the main program (.DPR, .DOF, .RES), the main form (.PAS, .DFM), and any other auxiliary form
- any .TXT for parameters, samples, test data
- all units (.PAS)
Those .ZIP
- are self-contained: you will not need any other product (unless expressly mentioned).
- for Delphi 6 projects, can be used from any folder (the paths are RELATIVE)
- will not modify your PC in any way beyond the path where you placed the .ZIP (no registry changes, no path creation etc).
To use the .ZIP:
- create or select any folder of your choice
- unzip the downloaded file
- using Delphi, compile and execute
To remove the .ZIP simply delete the folder.
The Pascal code uses the Alsacian notation, which prefixes identifiers by program area: K_onstant, T_ype, G_lobal, L_ocal, P_arametre, F_unction, C_lass etc. This notation is presented in the Alsacian Notation paper.
As usual:
- please tell us at fcolibri@felix-colibri.com if you found some errors, mistakes, bugs, broken links or had some problem downloading the file. Resulting corrections will be helpful for other readers
- we welcome any comment, criticism, enhancement, other sources or reference suggestion. Just send an e-mail to fcolibri@felix-colibri.com.
- or more simply, enter your comments below (anonymously, or with your e-mail if you want an answer) and click the "send" button
- and if you liked this article, talk about this site to your fellow developers, add a link to your links page or mention our articles in your blog or newsgroup posts when relevant. That's the way we operate: the more traffic and Google references we get, the more articles we will write.
6 - References
The Unit Test project and its source code are presented in the Unit Test Framework article.
7 - The author
Felix John COLIBRI works at the Pascal Institute. Starting with Pascal in 1979, he then became involved with Object Oriented Programming, Delphi, Sql, Tcp/Ip, Html, UML. Currently, he is mainly active in the area of custom software development (new projects, maintenance, audits, BDE migration, Delphi Xe_n migrations, refactoring), Delphi Consulting and Delphi training. His web site features tutorials, technical papers about programming with full downloadable source code, and the description and calendar of forthcoming Delphi, FireBird, Tcp/IP, Web Services, OOP / UML, Design Patterns, Unit Testing training sessions.